Kafka/long lived producer by guillermodotn · Pull Request #83 · release-engineering/cts

guillermodotn · 2026-06-26T11:11:35Z

Addresses points 3 and 4 raised in the review of PR #82:

3. There is batching of messages. The producer does this automatically (partially).
4. The producer is created for each batch of messages. Kafka seems to prefer a long-lived producer that is reused.

Comment ref: #82 (review)
Stacked on: #82

codecov-commenter · 2026-06-26T11:18:11Z

Codecov Report

❌ Patch coverage is 84.61538% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.73%. Comparing base (a24ff29) to head (da3334f).

Files with missing lines	Patch %	Lines
cts/messaging.py	84.61%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #83      +/-   ##
==========================================
+ Coverage   83.69%   83.73%   +0.03%     
==========================================
  Files          13       13              
  Lines        1325     1328       +3     
==========================================
+ Hits         1109     1112       +3     
  Misses        216      216

Flag	Coverage Δ
unit-tests	`83.73% <84.61%> (+0.03%)`	⬆️

lubomir

I like the general direction. However, I think we should hold off on further changes until we confirm in stage that the code actually works, so that we don't have to chase a moving target.

lubomir · 2026-06-26T12:30:44Z

            for msg in msgs:
                event = msg.get("event", "event")
                topic = "%s%s" % (conf.messaging_topic_prefix, event)
                producer.send(topic, msg)


Does this actually ever raise any exceptions? send() returns a Future immediately, so I would not expect network issues to surface as exceptions here. Adding back the flush() might help.

guillermodotn · 2026-06-26

flush() was dropped to let Kafka handle batching. CTS sends few messages, and linger_ms=0 means they're sent almost immediately anyway.

But you're right, without flush(), send() just returns a Future and delivery errors are never raised. The error recovery code would never trigger. I'll add it back.
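For reference, a minimal sketch of how delivery errors could be surfaced, assuming the producer is kafka-python's KafkaProducer, whose send() returns a future with a blocking get(timeout=...). The function name send_and_confirm, the topic_prefix parameter, and the 10 second timeout are hypothetical and not taken from CTS.

```python
def send_and_confirm(producer, msgs, topic_prefix, timeout=10.0):
    """Send every message, then wait for each delivery result.

    `producer` is assumed to behave like kafka-python's KafkaProducer:
    send(topic, value) returns a future whose get(timeout=...) returns
    record metadata on success or raises the delivery error.
    Returns a list of (topic, exception) pairs for failed deliveries.
    """
    pending = []
    for msg in msgs:
        event = msg.get("event", "event")
        topic = "%s%s" % (topic_prefix, event)
        # send() only queues the record; nothing is confirmed yet.
        pending.append((topic, producer.send(topic, msg)))

    failures = []
    for topic, future in pending:
        try:
            # Blocks until the broker acknowledges or the send fails.
            future.get(timeout=timeout)
        except Exception as exc:  # broad on purpose: collect, don't crash
            failures.append((topic, exc))
    return failures
```

Queuing all records before resolving any future keeps the producer's own batching intact; only the confirmation step blocks.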

lubomir · 2026-06-29

The built-in batching is a good point though. I didn't think about that. Maybe it's the error handling that should be removed?

guillermodotn · 2026-06-30

If the retry logic is also handled by Kafka itself, I guess the flush() and error handling can be removed. We would just be missing the ability to log delivery statuses on CTS's side.
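If flush() and the error handling are dropped, delivery status could still be logged without blocking. The following is a sketch only, assuming kafka-python, where the future returned by send() offers add_callback and add_errback. The helper name send_with_status_logging and the logger name are hypothetical.

```python
import logging

# Hypothetical logger name, used only for this sketch.
log = logging.getLogger("cts.messaging.sketch")


def send_with_status_logging(producer, topic, msg, logger=log):
    """Queue one message and log its delivery outcome asynchronously.

    Assumes a kafka-python style producer: the future returned by
    send() calls the callback with record metadata on success and the
    errback with the exception on failure.
    """
    future = producer.send(topic, msg)
    future.add_callback(
        lambda metadata: logger.debug("Delivered to %s: %r", topic, metadata)
    )
    future.add_errback(
        lambda exc: logger.error("Delivery to %s failed: %r", topic, exc)
    )
    return future
```

With this shape the producer's internal retries run first, and the errback fires only once the client has given up on the record.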

lubomir · 2026-06-26T12:32:46Z

+                    _kafka_producer.close()
+                except Exception:
+                    pass
+                _kafka_producer = None


This code seems rather fragile. There's a helper to create the producer, but here we still need to touch the global variable directly. Does KafkaProducer have some reconnection logic we could use instead?

guillermodotn · 2026-06-26

Roger, relying on Kafka's built-in reconnection and retry API instead.
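As an illustration of what relying on the client could look like, here is a sketch of producer options, assuming kafka-python's KafkaProducer keyword arguments (retries, retry_backoff_ms, reconnect_backoff_ms, reconnect_backoff_max_ms, linger_ms, acks). The values and the helper name build_producer_kwargs are hypothetical, not the settings used in this PR.

```python
# Keyword arguments understood by kafka-python's KafkaProducer.
# All values below are illustrative assumptions, not CTS defaults.
PRODUCER_OPTIONS = {
    "retries": 5,                      # resend attempts for failed records
    "retry_backoff_ms": 500,           # wait between resend attempts
    "reconnect_backoff_ms": 100,       # initial wait before reconnecting
    "reconnect_backoff_max_ms": 10000, # cap for the reconnect backoff
    "linger_ms": 0,                    # send without waiting to fill a batch
    "acks": "all",                     # wait for all in-sync replicas
}


def build_producer_kwargs(bootstrap_servers, **overrides):
    """Merge the illustrative defaults with caller-supplied overrides."""
    kwargs = dict(PRODUCER_OPTIONS, bootstrap_servers=bootstrap_servers)
    kwargs.update(overrides)
    return kwargs


# Intended use, given kafka-python is installed:
#   from kafka import KafkaProducer
#   producer = KafkaProducer(**build_producer_kwargs(["broker:9092"]))
```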

guillermodotn · 2026-06-30

Just realized that we will need to keep _retry_with_backoff for UBM compatibility.

But not a blocker.
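On the fragility concern about touching the global directly, one possible shape is a small holder that owns the producer and is the only place where it is created or dropped. This is a sketch, not the code in this PR: the class name ProducerHolder and the factory argument are hypothetical, and the factory is assumed to return an object with a close() method, as KafkaProducer does.

```python
import threading


class ProducerHolder:
    """Owns one lazily created, long-lived producer."""

    def __init__(self, factory):
        self._factory = factory      # callable that builds a producer
        self._lock = threading.Lock()
        self._producer = None

    def get(self):
        """Return the shared producer, creating it on first use."""
        with self._lock:
            if self._producer is None:
                self._producer = self._factory()
            return self._producer

    def reset(self):
        """Drop the current producer so the next get() builds a new one."""
        with self._lock:
            producer, self._producer = self._producer, None
        if producer is not None:
            try:
                producer.close()
            except Exception:
                pass  # best effort: the producer is being discarded anyway
```

Callers would use holder.get().send(...) and call holder.reset() only if the client's own reconnection gives up, so no code outside the holder needs to assign the shared variable.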

guillermodotn requested a review from lubomir June 26, 2026 11:11

refactor(messaging): use a long-lived Kafka producer

da3334f

guillermodotn force-pushed the kafka/long-lived-producer branch from 08ef23e to da3334f Compare June 26, 2026 11:15

lubomir reviewed Jun 26, 2026

guillermodotn marked this pull request as draft June 26, 2026 12:39
